Abstract

We propose a new node centrality measure in networks, the lobby
index, which is inspired by Hirsch's $h$-index. It is shown that in
scale-free networks with exponent $\alpha$ the distribution of the $l$-index
has a power tail with exponent $\alpha(\alpha + 1)$. Properties of the
$l$-index and extensions are discussed.



Efficient communication means high impact (wide access or high reach)
and low cost. This goal is common in communication networks, in society
and in biological systems. Over time many centrality measures have been
proposed to characterize a node's role, position, or influence in a network,
but none of them captures the efficiency of communication. This paper is
intended to fill this gap by proposing a new centrality measure, the lobby
index.

Hirsch [?] proposed the $h$-index: "the number of papers with citation
number $\ge h$, as a useful index to characterize the scientific output of a
researcher". Barabasi et al. [?] devised a very simple network model which
has several key properties: most importantly, the degree distribution has a
power-law upper tail, the node degrees are independent, and typical nodes
are close to each other. Schubert [?] used the $h$-index as a network
indicator, particularly in scale-free networks. This paper is devoted to the
characterization of network nodes with an $h$-index type measure.

Definition 1 The l-index or lobby index of a node x is the largest integer 
k such that x has at least k neighbors with a degree of at least k. (See also 

In what follows some properties of the lobby index are investigated; it is
shown that in Scale Free (SF) networks with exponent $\alpha$, the
distribution of the $l$-index has a fat tail with exponent $\alpha(\alpha + 1)$.
Furthermore the empirical distribution of the $l$-index in generated and
real life networks is investigated and some further extensions are discussed.

1 Centrality measures 

Freeman's prominent paper [?] (1979) pointed out that: "Over the years, a 
great many measures of centrality have been proposed. The several measures 
are often only vaguely related to the intuitive ideas they purport to index, 
and many are so complex that it is difficult or impossible to discover what, 
if anything, they are measuring." It is perhaps worth noting that research in 
this field dates back to Bavelas [?] (1949). 

At the time of Freeman's paper most centrality measures were equivalents 
or modifications of the three major and widely accepted indexes, the degree 
(cf. [?]), closeness (cf. [?] and [?]) and betweenness (cf. [?]) centrality (see
also Borgatti [?]). 

As time passed, many new centrality measures were proposed. After 
years of research and application, the above three and eigenvector centrality 
(a variant of which computer scientists call PageRank [?] and Google uses 
to rank search results) can be said to have become a standard; the others 
are not widely used. The historical three and eigenvector centrality are thus 
the conceptual base for investigating centrality behavior of nodes and full 
networks. 

Notwithstanding Freeman's wise warning, the present paper proposes the
lobby centrality (index) in the belief that Hirsch's insight into publication 
activity (which produces the citation network) has an interesting and relevant 
message to network analysis in general. 

The diplomat's dilemma. It is clear that a person has strong lobby power,
the ability to influence people's opinions, if he or she has many highly
connected neighbors. This is exactly the aim of a lobbyist or a diplomat [?].
The diplomat's goal is to have strong influence on the community while
keeping the number of his connections (which have a cost) low. If $x$ has a
high lobby index, then the $l$-core $L(x)$ (those neighbors which provide the
index) has high connectivity (statistically higher than $l(x)$, see (6) and
the comment there). In this sense, the $l$-index is closely related to the
solution of the diplomat's dilemma.

Communication networks. Research on communication networks and research on
network topology influence each other. Node centrality measures are
essential in the study of net mining [?], malware detection [?], reputation
based peer to peer systems [?], delay tolerant networks [?] and others (see
[?], [?] and the references therein). We expect that in the case of social
and communication networks (some of which are also based on social
networks) the lobby index is located between the bridgeness [?], closeness,
eigenvector and betweenness centrality. Based on this intermediate position
of the lobby index we expect that it can be a useful aid in developing good
defence and immunization strategies for peer to peer networks, and that it
can help create more efficient broadcasting schemes in sensor networks and
marketing or opinion shaping strategies.

The distribution of the $l$-index. Let us consider scale-free networks and
assume that the node degrees are independent. The degree of a node $x$ is
denoted by $\deg(x)$ and the $l$-index is defined as follows. Let us order
all neighbors $y_i$ of $x$ so that $\deg(y_1) \ge \deg(y_2) \ge \dots$; then

$$l(x) = \max\{k : \deg(y_k) \ge k\}. \qquad (1)$$
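
Definition (1) translates directly into a few lines of code. The following is a minimal sketch, assuming the network is stored as a dictionary that maps each node to the set of its neighbors; the function name `lobby_index` and the toy graph are illustrative choices, not part of the original text.

```python
def lobby_index(adj, x):
    """Largest k such that x has at least k neighbors of degree >= k."""
    # Degrees of the neighbors of x, sorted in decreasing order,
    # i.e. deg(y_1) >= deg(y_2) >= ... as in definition (1).
    degs = sorted((len(adj[y]) for y in adj[x]), reverse=True)
    k = 0
    for rank, d in enumerate(degs, start=1):
        if d >= rank:
            k = rank
        else:
            break
    return k


# Toy graph (hypothetical): a triangle a-b-c with a pendant node d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
indices = {v: lobby_index(graph, v) for v in graph}
```

Sorting the neighbor degrees dominates the cost, so the index of a node of degree $n$ is obtained in $O(n \log n)$ time from local information only.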

Theorem

If the vertex degrees are independent and
$P(\deg(x) \ge k) \simeq c\,k^{-\alpha}$ for all nodes $x$, then

$$P(l(x) \ge k) \simeq k^{-\alpha(\alpha+1)} \qquad (2)$$

for all nodes $x$.

The proof is provided in the Appendix.

Here and in what follows $a_n \approx b_n$ means that $a_n / b_n \to c$ as
$n \to \infty$, and $a_n \simeq b_n$ means that there is a $C > 1$ such that
for all $n$, $1/C < a_n / b_n < C$.

The Hirsch index. The original Hirsch index is based on a richer model:
author $\to$ paper and paper $\to$ citing paper links. Let $x$ be a randomly
chosen author of the scientific community under scrutiny and let $n = n(x)$
be the number of his/her papers (either in general or within a defined
period). Let $y_i$ denote the individual papers (where $i = 1, \dots, n$) and
$c(y_i)$ their citation scores (in decreasing order), so that
$c(y_1) \ge c(y_2) \ge \dots \ge c(y_n)$. Then $h(x)$ is the Hirsch index of $x$:

$$h(x) = \max\{k : c(y_k) \ge k\}.$$
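
For comparison with the lobby index, the Hirsch index can be computed from a plain list of citation counts. This is a minimal sketch; the function name `hirsch_index` and the sample citation list are illustrative assumptions.

```python
def hirsch_index(citations):
    """Largest k such that at least k papers have at least k citations each."""
    # Citation scores in decreasing order: c(y_1) >= c(y_2) >= ...
    ranked = sorted(citations, reverse=True)
    h = 0
    for k, c in enumerate(ranked, start=1):
        if c >= k:
            h = k
        else:
            break
    return h


# Hypothetical author with five papers.
example_h = hirsch_index([10, 8, 5, 4, 3])
```

The only difference from the lobby index is the input: citation scores of an author's papers take the place of the degrees of a node's neighbors.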

Assume that the paper productivity has an $\alpha$-fat tail:
$G_l^p = P(n(x) \ge l) \simeq c\,l^{-\alpha}$ and the citation score has a
$\beta$-fat tail:

$$G_l^c = P(c(y) \ge l) \simeq c\,l^{-\beta}.$$

Along the lines of the argument that led to (2) one can see that $h$ has an
$\alpha(\beta + 1)$-fat tail [?]:

$$P(h(x) \ge k) \simeq k^{-\alpha(\beta+1)}. \qquad (3)$$



How good is an $l$-index of $k$? If a node $x$ of degree $n$ has an $l$-index
of $k$, Glanzel's [?] observation provides a preliminary assessment of this
value:

$$l(x) \simeq c \deg(x)^{1/(\alpha+1)} \qquad (4)$$

where $\alpha$ is the tail exponent of the degree distribution. Consequently
a lobbyist is doing a good job of solving the diplomat's dilemma if
$l(x) \gg \deg(x)^{1/(\alpha+1)}$. On the other hand, our result shows that
$l(x) \ge k$ means that $x$ belongs to the top
$100\,c_\alpha k^{-\alpha(\alpha+1)}$ percent of lobbyists.

The lobby gain. The performance of a lobbyist is indicated by a measure
called the lobby gain. The lobby gain shows how the access to the network
is multiplied using a link to the $l$-core. Let us use the notation
$D_i(y) = \{z : d(z, y) = i\}$ and set
$D_2^l(x) = \bigcup_{y \in L(x)} D_1(y) \setminus [D_1(x) \cup \{x\}]$; then the
number of second neighbors reachable via the $l$-core is
$\deg_2^l(x) = |D_2^l(x)|$ and the lobby gain is defined as

$$\Gamma_l(x) = \frac{\deg_2^l(x)}{l(x)}. \qquad (5)$$
The lobby gain $\Gamma_l(x)$ is much larger than one if a typical link to the
$l$-core provides a lot of connections to the rest of the network for $x$ via
that link. It can be shown (see [?]) that the number of second neighbors
reachable via the $l$-core (with multiplicity) is $l(x)^2$.
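
The quantities in this paragraph can be sketched in code under two explicit assumptions that the text does not settle: the lobby gain is taken to be the number of second neighbors reachable via the $l$-core divided by $l(x)$, and the $l$-core $L(x)$ is taken to be the $l(x)$ highest-degree neighbors of $x$, with ties broken by node name. The helper names `l_core` and `lobby_gain` and the toy graph are hypothetical.

```python
def l_core(adj, x):
    """Return (l, core): the lobby index of x and its l highest-degree neighbors."""
    # Assumption: ties between equal-degree neighbors are broken by node name.
    ranked = sorted(adj[x], key=lambda y: (-len(adj[y]), str(y)))
    l = 0
    for rank, y in enumerate(ranked, start=1):
        if len(adj[y]) >= rank:
            l = rank
        else:
            break
    return l, ranked[:l]


def lobby_gain(adj, x):
    """Second neighbors reachable via the l-core, per l-core link (assumed ratio)."""
    l, core = l_core(adj, x)
    if l == 0:
        return 0.0
    second = set()
    for y in core:
        second |= adj[y]          # first neighbors of the l-core members
    second -= adj[x] | {x}        # drop x itself and its own first neighbors
    return len(second) / l


# Toy graph (hypothetical): x has two neighbors a and b, each of degree 3.
toy = {
    "x": {"a", "b"},
    "a": {"x", "p", "q"},
    "b": {"x", "q", "r"},
    "p": {"a"},
    "q": {"a", "b"},
    "r": {"b"},
}
gain = lobby_gain(toy, "x")
```

In the toy graph the two core links of `x` open access to the three nodes `p`, `q` and `r`, so each core link contributes one and a half new second neighbors on average.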

The degree distribution within the $l$-core. The influential acquaintances of
a given lobbyist follow a fat tail distribution provided the underlying
network is SF. In other words, if $y \in L(x)$ and $l(x) = k$ then the degree
distribution of $y$, truncated at $k$, again follows a fat tail distribution:
for $m \ge k > 0$

$$P(\deg(y) \ge m \mid y \in L(x) \text{ and } l(x) = k) \simeq c \left(\frac{k}{m}\right)^{\alpha}. \qquad (6)$$

Let us note that this conditional or truncated distribution has a higher
expected value than the original one.



2 Network examples 

The analysis of different networks has received particular attention in the
last decades. The research goals and tools vary greatly. Here we return to
the roots, consider some "classic" networks and study the distribution of
their lobby index.

Generated scale-free networks. We have generated 50 Barabasi (BA) networks
[?] of 20000 nodes each, adding 10 new links at each step and starting with
10 initial nodes. The degree distribution passed the preliminary test and
has $\alpha = 1.96$, i.e. a 1.96-fat tail. As Figure 1 shows, the empirical
distribution of the lobby index has $\eta = 5.14$ while the theory predicts
$\eta = 5.76$.

Figure 1. The log-log plots for the distribution of the $l$-index
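
The experiment on generated networks can be reproduced in outline as follows. This is a sketch under stated assumptions, not the authors' implementation: the 10 initial nodes are assumed to form a complete graph (the text does not say how they are connected), the network is kept small for speed, and the names `barabasi_albert` and `lobby_index` are illustrative.

```python
import random
from collections import Counter


def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node attaches m links preferentially."""
    if m < 2 or n < m:
        raise ValueError("need m >= 2 and n >= m")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Each node appears in `ends` once per incident link, so a uniform draw
    # from `ends` selects a node with probability proportional to its degree.
    ends = []
    # Assumption: the m initial nodes form a complete graph.
    for u in range(m):
        for v in range(u + 1, m):
            adj[u].add(v)
            adj[v].add(u)
            ends += [u, v]
    for new in range(m, n):
        targets = set()
        while len(targets) < m:          # m distinct, degree-biased targets
            targets.add(rng.choice(ends))
        for t in sorted(targets):
            adj[new].add(t)
            adj[t].add(new)
            ends += [new, t]
    return adj


def lobby_index(adj, x):
    """Largest k such that x has at least k neighbors of degree >= k."""
    degs = sorted((len(adj[y]) for y in adj[x]), reverse=True)
    # d >= rank holds exactly for a prefix of the decreasing degree list.
    return sum(1 for rank, d in enumerate(degs, start=1) if d >= rank)


graph = barabasi_albert(n=2000, m=10, seed=1)
l_counts = Counter(lobby_index(graph, v) for v in graph)
```

The counter `l_counts` holds the empirical distribution of the $l$-index; its upper tail is what would be plotted on a log-log scale to estimate $\eta$. With 10 links per new node every node ends up with at least 10 neighbors of degree at least 10, so only values $l \ge 10$ occur.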



We have also used the generalized Barabasi model [?], which can provide
arbitrary $\alpha > 1$. We generated 50 graphs of 10000 nodes with the
proposed algorithm and obtained networks with $\alpha = 1.9186$ on average,
which would imply $\eta = 5.60$; we observed $\eta = 5.28$.

Let us remark that there are sophisticated techniques [?] for estimating the
tail exponent of fat tail distributions which are superior to the line fit
on the log-log scale. The careful analysis and application of these methods
to the $l$-index will be published elsewhere.

The AS level graph. The Autonomous System (AS) level of the Internet
infrastructure has already been investigated in depth (cf. [?] and its
bibliography). It turned out that it not only has a scale-free degree
distribution but displays the rich club phenomenon as well: high degree
nodes are more densely interlinked than expected in a BA graph. The standard
choice for a source of AS sample data is the CAIDA [?] project. We determined
the exponent of the tail of the degree distribution and compared it with the
exponent of the tail of the empirical distribution of the $l$-index. We found
that $\alpha = 1.61$, so that $\eta = \alpha(\alpha + 1) = 4.21$, while
$\eta = 4.14$ is estimated from the empirical distribution of $l$.

The IG model. Mondragon et al. [?] proposed a modification of the Barabasi
network model, the Interactive Growth (IG) model, to generate scale-free
networks which exhibit the rich club behavior. In each iteration, a new node
is linked to one or two existing nodes (hosts). In the first case the host
node is connected to two additional peer nodes using the preferential
attachment scheme, while in the latter case only one of the involved
(randomly chosen) hosts is connected to a new peer. We implemented this
algorithm and again compared the exponents extracted from sample data. In
this case the network size was 3000, the probability of one host was 0.4 and
that of two hosts was 0.6. Again the log-log fit of the degree distribution
tail yielded $\alpha = 1.23$, hence $\eta = 2.74$, while $\eta = 2.45$ is given
by the empirical distribution of $l$.
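
The comparison between predicted and observed exponents in the four examples above can be summarized with a short script. This is only a sketch: the function name `predicted_exponent` is illustrative, and the numbers are the rounded values quoted in the text, so the recomputed predictions may differ from the quoted ones in the last digit.

```python
def predicted_exponent(alpha):
    """Tail exponent alpha * (alpha + 1) predicted for the l-index."""
    return alpha * (alpha + 1)


# (network, fitted degree exponent, fitted l-index exponent) as quoted above.
reported = [
    ("BA", 1.96, 5.14),
    ("generalized BA", 1.9186, 5.28),
    ("AS level graph", 1.61, 4.14),
    ("IG model", 1.23, 2.45),
]

for name, alpha, eta_observed in reported:
    eta_predicted = predicted_exponent(alpha)
    # Relative gap between the theoretical and the empirical exponent.
    gap = (eta_predicted - eta_observed) / eta_predicted
    print(f"{name}: predicted {eta_predicted:.2f}, "
          f"observed {eta_observed:.2f}, relative gap {gap:.1%}")
```

In every case the empirical exponent lies below the predicted one, which is the pattern the text reports.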

The place of the lobby index among other centrality measures. As we already
indicated above, the lobby index lies somewhere between the closeness (cl),
betweenness (bw) and eigenvector (ev) centrality. Strong correlation with
degree centrality is out of the question in the light of (4). In order to
gain a better picture of the behavior of the lobby index we determined the
Spearman correlation between these centrality measures in the AS graph.

Table 1. Spearman rank correlation



The correlations in Table 1 indicate that the $l$-index contains a well
balanced mix of other centrality measures; the $l$-index is slightly closer
to the three "classical" centralities than they are to each other (the
quadratic mean of the three correlations, in boldface, is 0.638 while the
quadratic mean of the correlations with the $l$-index, in italic, is 0.678).
The Kendall correlations of the investigated centralities have also been
calculated and yielded a very similar picture. For biological networks the
Spearman correlation between the closeness and eigenvector centrality is
high (cf. [?]); high Pearson correlation can be observed on other networks as
well (cf. [?]). In such cases one centrality measure can be used to
approximate the other. This is not the case with the $l$-index for the AS
graph, but it may happen in other types of networks, where it would save
computation time given the simplicity of the calculation of the $l$-index.
Freeman's paper [?] as well as [?] and [?] are calls for a further analysis of
centrality measures and the $l$-index on different types of networks.

Conclusion. A new centrality measure, the $l$-index, is proposed and
examined. It is shown that the distribution of the $l$-index has an
$\alpha(\alpha + 1)$-fat tail in SF networks with exponent $\alpha$. There is
a good match between the empirical observations (collected in Table 2) and
the theoretical result. In this case the aim of the empirical results was
not to verify the theory but to emphasize that the investigated networks
behave in the expected way with respect to the lobby index.

Table 2. The tail exponents of networks



The lobby index is placed on the map alongside other centrality measures.
Some further extensions and properties are discussed as well: the relation
to the diplomat's dilemma is investigated and the lobby index is
demonstrated to be a good performance measure for lobbyists.

3 Appendix

In what follows we provide a rigorous derivation of (2). Let us use the
notation $l_k = P(l(x) = k)$ for the distribution of the $l$-index, and
$G_k = P(\deg(x) \ge k) = 1 - F_k$. Henceforth $c$ will be an arbitrary
positive constant unless specified otherwise. Its value may change from
occurrence to occurrence.

$$l_k = \sum_{i=0}^{\infty} P(l(x) = k, \deg(x) = k + i) = \sum_{i=0}^{\infty} P(l(x) = k \mid \deg(x) = k + i)\, P(\deg(x) = k + i).$$

Here the law of total probability (a partition of the sample space combined
with conditional probability) is used. As a result we have to investigate
the probability that a node has $k$ links to nodes of degree at least $k$ and
$i$ other links to nodes of degree not higher than $k$, given that it has
$k + i$ links in total. These conditions give exactly $l(x) = k$.
First we develop a lower estimate for $l_k$.

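$$l_k \ge c\,(G_k)^k \sum_{i=0}^{\infty} \binom{k+i}{k} (1 - G_k)^i (k+i)^{-(\alpha+1)}.$$

We estimate $l_k$ using $1 - c_1 k^{-\alpha} \ge e^{-c k^{-\alpha}}$ and
$\binom{k+i}{k} \ge \frac{i^k}{k!}$:

$$l_k \ge c\,\frac{k^{-\alpha k}}{k!} \sum_{i=0}^{\infty} i^k (k+i)^{-(\alpha+1)} e^{-c i k^{-\alpha}}
\ge c\,\frac{k^{-\alpha k}}{k!} \int_0^{\infty} x^{k-(\alpha+1)} e^{-c x k^{-\alpha}}\,dx$$

$$= c\,k^{-\alpha^2}\,\frac{\Gamma(k-\alpha)}{k!}
= c\,k^{-\alpha^2}\,\frac{\Gamma(k-\alpha)}{\sqrt{2\pi}\,k^{k+1/2} e^{-k+\theta/(12k)}}
\ge c\,k^{-\alpha(\alpha+1)-1},$$

where the Stirling formula has been used for $\Gamma(k - \alpha)$ and $k!$ as
well and $0 < \theta < 1$. The upper estimate works similarly as follows.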

$$l_k \le c\,(G_k)^k \sum_{i=0}^{\infty} (k+i)^{-(\alpha+1)} \binom{k+i}{k} (F_k)^i.$$

Introducing a new variable one obtains an upper bound of the same order,
$l_k \le c\,k^{-\alpha(\alpha+1)-1}$, where at the end the Stirling formulas
have been used as in the lower estimate.